A Contribution of Intrinsic Speech Variabilities to Errors Done by Speech Recognition
Author
Abstract
ASR accuracy is usually evaluated by computing the Word Error Rate (WER) and the Sentence Error Rate (SER). The misrecognitions that contribute to WER fall into three categories: deletions, insertions, and substitutions. This paper studies how intrinsic speech variabilities contribute to each of these error categories. Decision tree (DT) analysis is used, and five DT styles are examined: CART, C4.5, Minimum Message Length (MML), strict MML, and Bayesian decision trees. We apply these techniques to the output of an automatic speech recognizer fed with intrinsically variable speech.
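To make the three error categories concrete, the sketch below counts substitutions, deletions, and insertions with a word-level edit-distance alignment and derives WER as (S + D + I) / N, where N is the number of reference words. It is an illustration only: the abstract does not say which scoring tool the authors used, and the names `ErrorCounts` and `count_errors` are hypothetical. When several alignments have the same total cost, the split between categories depends on the tie-breaking rule; here ties are broken in favor of fewer substitutions, then fewer deletions.

```python
from typing import NamedTuple


class ErrorCounts(NamedTuple):
    """Hypothetical container for the per-category error counts."""
    substitutions: int
    deletions: int
    insertions: int
    ref_len: int

    @property
    def wer(self) -> float:
        # WER = (S + D + I) / N, with N the number of reference words.
        if self.ref_len == 0:
            raise ValueError("WER is undefined for an empty reference")
        errors = self.substitutions + self.deletions + self.insertions
        return errors / self.ref_len


def count_errors(reference: str, hypothesis: str) -> ErrorCounts:
    """Align two whitespace-tokenized strings and count S, D, and I."""
    ref = reference.split()
    hyp = hypothesis.split()

    # dp[i][j] = (total, subs, dels, ins) for aligning ref[:i] with hyp[:j].
    dp = [[(0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)   # all reference words deleted
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)   # all hypothesis words inserted

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            t, s, d, n = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (t, s, d, n)          # match, no cost
            else:
                diagonal = (t + 1, s + 1, d, n)  # substitution
            t, s, d, n = dp[i - 1][j]
            deletion = (t + 1, s, d + 1, n)
            t, s, d, n = dp[i][j - 1]
            insertion = (t + 1, s, d, n + 1)
            # Tuples compare by total cost first, then by S, D, I.
            dp[i][j] = min(diagonal, deletion, insertion)

    _, subs, dels, ins = dp[-1][-1]
    return ErrorCounts(subs, dels, ins, len(ref))


if __name__ == "__main__":
    counts = count_errors("the cat sat on the mat", "the cat sit on mat today")
    print(counts, counts.wer)
```

Per-utterance counts of this kind, paired with labels for the speech variability of each utterance, are the sort of table a decision-tree analysis could be run on.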
Similar resources
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
Oldenburg logatome speech corpus (OLLO) for speech recognition experiments with humans and machines
This paper introduces the new OLdenburg LOgatome speech corpus (OLLO) and outlines design considerations during its creation. OLLO is distinct from previous ASR corpora as it specifically targets (1) the fair comparison between human and machine speech recognition performance, and (2) the realistic representation of intrinsic variabilities in speech that are significant for automatic speech rec...
Emotional Aspects of Intrinsic Speech Variabilities in Automatic Speech Recognition
We analyze two German databases: the OLLO database [1] designed for doing speech recognition experiments on speech variabilities, and the Berlin emotional database [2] designed for the analysis and synthesis of emotional speech. The paper tries to find a relation between intrinsic speech variabilities and the emotions. Moreover, we study this relation from the point of view of speech recognitio...
Complementarity of MFCC, PLP and Gabor features in the presence of speech-intrinsic variabilities
In this study, the effect of speech-intrinsic variabilities such as speaking rate, effort and speaking style on automatic speech recognition (ASR) is investigated. We analyze the influence of such variabilities as well as extrinsic factors (i.e., additive noise) on the most common features in ASR (mel-frequency cepstral coefficients and perceptual linear prediction features) and spectro-tempora...
A New Method for Robust Speech Recognition Based on Missing Data Using a Bidirectional Neural Network
The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing-feature method. In this approach, the components of the signal's time-frequency representation (spectrogram) that have a low signal-to-noise ratio (SNR) are tagged as missing, deleted, and then replaced using the remaining components and statistical ...